Rational kernels for Arabic Root Extraction and Text Classification
نویسندگان
چکیده
منابع مشابه
Rational Kernels for Arabic Stemming and Text Classification
In this paper, we address the problems of Arabic Text Classification and stemming using Transducers and Rational Kernels. We introduce a new stemming technique based on the use of Arabic patterns (Pattern Based Stemmer). Patterns are modelled using transducers and stemming is done without depending on any dictionary. Using transducers for stemming, documents are transformed into finite state tr...
متن کاملKernels and Similarity Measures for Text Classification
Measuring similarity between two strings is a fundamental step in text classification and other problems of information retrieval. Recently, kernel-based methods have been proposed for this task; since kernels are inner products in a feature space, they naturally induce similarity measures. Information theoretic (dis)similarities have also been the subject of recent research. This paper describ...
متن کاملA Markovian approach for arabic root extraction
In this paper, we present an Arabic morphological analysis system that assigns, for each word of an unvoweled Arabic sentence, a unique root depending on the context. The proposed system is composed of two modules. The first one consists of an analysis out of context. In this module, we segment each word of the sentence into its elementary morphological units in order to identify its possible r...
متن کاملString Subsequence Kernels for Text Classification
This paper explores the string subsequence kernel, a kernel function whose feature space is generated by subsequences of strings. This kernel compares two strings based on the number of occurrences of common substrings they contain, where each common substring is weighted based on how contiguous that substring is within the string. Although a recursive definition of the string subsequence kerne...
متن کاملText Classification using String Kernels
We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text though not necessarily contiguously. The subsequences are weighted by an exponentially decaying factor of their full length in t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of King Saud University - Computer and Information Sciences
سال: 2016
ISSN: 1319-1578
DOI: 10.1016/j.jksuci.2015.11.004